26 February 2024

StreamSQL

StreamSQL provides a low-code (SQL) approach to defining and operating ad-hoc data flows, such as reading from Kafka (Apache Kafka, a distributed event store and stream-processing platform) and writing directly to the Space (the logical in-memory cache where GigaSpaces data is stored), or reading from one Kafka topic and writing to another Kafka topic.

Behind the scenes, StreamSQL uses the low-code capabilities of Flink (Apache Flink, an open-source, unified stream-processing and batch-processing framework) to define a schema through the Flink SQL CREATE TABLE statement.

StreamSQL operations are defined using standard SQL statements, for example to:

Define the structure of messages in a Kafka topic as a table (CREATE TABLE)
Define a data flow (a stream of data, or pipeline) as an INSERT INTO ... SELECT statement
Join data flows from different Kafka topics using a standard SQL JOIN
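
The three activities above can be sketched in Flink SQL as follows. This is a minimal illustration, not a statement taken from the product: the table names, topic names, columns, consumer group, and the broker address `localhost:9092` are all assumptions, and the connector options are those of the standard Apache Flink Kafka SQL connector with the JSON format.

```sql
-- Hypothetical source table: describes the JSON messages in an assumed "orders" topic.
CREATE TABLE orders (
  order_id    STRING,
  customer_id STRING,
  amount      DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',                                   -- assumed topic name
  'properties.bootstrap.servers' = 'localhost:9092',    -- assumed broker address
  'properties.group.id' = 'streamsql-demo',             -- assumed consumer group
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);

-- Hypothetical second source table, read from an assumed "customers" topic.
CREATE TABLE customers (
  customer_id   STRING,
  customer_name STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'customers',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'streamsql-demo',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);

-- Hypothetical sink table: the joined rows are written to another Kafka topic.
CREATE TABLE enriched_orders (
  order_id      STRING,
  customer_name STRING,
  amount        DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'enriched-orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json'
);

-- The data flow itself: a continuous query that joins the two topics
-- and inserts every matching row into the sink topic.
INSERT INTO enriched_orders
SELECT o.order_id, c.customer_name, o.amount
FROM orders AS o
JOIN customers AS c
  ON o.customer_id = c.customer_id;
```

An inner join is used here because, on insert-only Kafka sources, it produces insert-only results that a plain Kafka sink can accept. Note that a regular streaming join keeps both inputs in state, so a long-running pipeline would normally need a state retention setting or a time-bounded join.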

One useful StreamSQL use case is IoT, where a continuous flow of sensor data changes is consumed from Kafka and aggregated into a summary table, and the aggregated summary is then pushed to the Space for consumption by data services.
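
The aggregation step of such an IoT flow could look roughly like the sketch below. The sensor schema, topic names, broker address, and the one-minute window are assumptions made for illustration. The sink here is a Kafka topic rather than the Space, because the options of the Space sink are not described on this page; the StreamSQL Implementation page covers the Kafka to Space variant.

```sql
-- Hypothetical source: sensor readings arriving as JSON on an assumed topic.
CREATE TABLE sensor_readings (
  sensor_id    STRING,
  temperature  DOUBLE,
  reading_time TIMESTAMP(3),
  -- Event-time watermark: tolerate readings that arrive up to 5 seconds late.
  WATERMARK FOR reading_time AS reading_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'sensor-readings',                          -- assumed topic name
  'properties.bootstrap.servers' = 'localhost:9092',    -- assumed broker address
  'properties.group.id' = 'streamsql-iot-demo',         -- assumed consumer group
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);

-- Hypothetical summary table: one row per sensor per window.
CREATE TABLE sensor_summary (
  sensor_id       STRING,
  window_start    TIMESTAMP(3),
  window_end      TIMESTAMP(3),
  avg_temperature DOUBLE,
  reading_count   BIGINT
) WITH (
  'connector' = 'kafka',
  'topic' = 'sensor-summary',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json'
);

-- Aggregate readings into one-minute tumbling windows per sensor.
-- Each window emits a single final row once the watermark passes its end.
INSERT INTO sensor_summary
SELECT
  sensor_id,
  window_start,
  window_end,
  AVG(temperature) AS avg_temperature,
  COUNT(*)         AS reading_count
FROM TABLE(
  TUMBLE(TABLE sensor_readings, DESCRIPTOR(reading_time), INTERVAL '1' MINUTE)
)
GROUP BY sensor_id, window_start, window_end;
```

A windowed aggregation is chosen over a plain GROUP BY because it emits append-only results, which keeps the sink definition simple. A non-windowed GROUP BY would emit updates and would need a sink that supports upserts.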

For an overview of the StreamSQL menu in SpaceDeck, the GigaSpaces user interface for setting up, managing, and controlling the environment, refer to the SpaceDeck StreamSQL Overview page.

For examples of statements used for Kafka to Space with Aggregation and Kafka to Kafka with Aggregation, refer to the StreamSQL Implementation page.